Skip to content

ref: Unify rendering pipeline with structured output and snapshot tests#318

Closed
cameroncooke wants to merge 50 commits intomainfrom
refactor/rendering-pipeline
Closed

ref: Unify rendering pipeline with structured output and snapshot tests#318
cameroncooke wants to merge 50 commits intomainfrom
refactor/rendering-pipeline

Conversation

@cameroncooke
Copy link
Copy Markdown
Collaborator

@cameroncooke cameroncooke commented Apr 8, 2026

Summary

Major refactor of how all tool and resource output is produced across both MCP and CLI transports. Previously, rendering logic was scattered across individual tool handlers — each tool constructed its own ToolResponse directly, leading to inconsistent output between MCP and CLI, duplicated formatting, and no way to verify output without manual testing.

This PR introduces a unified rendering pipeline with a clean separation: tools emit structured events, a render session produces text, and protocol-specific result types are constructed only at the transport boundary. Snapshot tests lock down the output contract so regressions are caught automatically.

Architecture

tool handler → ctx.emit(PipelineEvent) → RenderSession → transport boundary
  • Tools emit events, never construct results. Every code path (success, error, early return) emits PipelineEvent objects through a handler context. The render session owns the result.
  • Rendering is pure. Events in, strings out. No knowledge of stdout, wire format, or protocol types.
  • Transport boundary constructs protocol types. MCP wraps rendered text into response objects. CLI writes to stdout. Daemon sends raw events over the wire for the CLI to render locally.
  • Streaming via the same path. CLI wraps the session to write each event's rendered text to stdout as it arrives. Xcodebuild pipeline streams parsed build events through ctx.emit in real-time.

Render strategies

Two strategies — text (human-readable) and json (JSONL) — selected at the boundary. The text strategy uses existing formatters from event-formatting.ts. The JSON strategy is JSON.stringify(event) + '\n'.

Boundary flows

Transport What happens
MCP Creates render session, executes tool, finalizes session, wraps text + image attachments into protocol response (constructed here only)
CLI direct Creates render session with streaming emitter that writes to stdout as events arrive, finalizes after tool completes
Daemon server Records events, sends them over the wire (no rendering)
CLI via daemon Receives events from daemon, renders locally through the same CLI path

Key changes

  • All ~50 tool handlers migrated from return toolResponse([...]) to ctx.emit(event) calls via typed tool factories
  • Centralized error handling across all tool logic functions
  • Extracted shared orchestration primitives (simulator-steps, macos-steps, test-preflight, swift-test-discovery) to deduplicate build/run/test workflows
  • Structured xcodebuild output parsing with dedicated event parser and line-parser modules
  • Integrated log capture directly into build/run tools; removed standalone logging tools (start_sim_log_cap, stop_sim_log_cap, start_device_log_cap, stop_device_log_cap, launch_app_logs_sim)
  • Manifest-driven MCP resources with predicate-aware registration
  • Daemon protocol bumped to v2 (events over wire instead of rendered text)
  • Home directory paths collapsed to ~/ in output for readability
  • Snapshot tests added for tool and resource output with system-agnostic normalization

- Migrate all test tools to canonical single-pipeline pattern with
  createPendingXcodebuildResponse (test_sim, test_device, test_macos)
- Add test discovery preflight to device and macOS test tools
- Migrate 5 query tools to consistent formatting with front matter,
  structured error blocks, and clean output (list_schemes,
  show_build_settings, get_sim_app_path, get_mac_app_path,
  get_device_app_path)
- Add manifest-level when field for next step templates
  (success/failure/always) to control when next steps appear
- Group warnings before summary with yellow triangle symbol
- Fix compiler error rendering for multi-line errors
- Fix test failure rendering to use consistent indented format with
  relative file paths
- Fix linker warning rendering to use structured format
- Remove dead code (legacy finalization helpers, dead modules,
  unused DI patterns)
- Extract duplicate deriveDiagnosticBaseDir to shared module
- Create shared xcodebuild-error-utils for query tool error parsing
Replace ad-hoc text formatting across all tools with a single structured
event pipeline. Every tool now emits typed PipelineEvent objects (header,
statusLine, section, detailTree, table, fileRef, summary) that flow
through shared renderers for consistent output formatting.

Key changes:

- Add PipelineEvent union type replacing XcodebuildEvent with canonical
  types (HeaderEvent, StatusLineEvent, SectionEvent, DetailTreeEvent,
  TableEvent, FileRefEvent) plus xcodebuild-specific types (BuildStageEvent,
  CompilerWarningEvent, CompilerErrorEvent, test events)

- Add toolResponse() entry point that feeds events through shared
  resolveRenderers() pipeline (MCP buffer + CLI stdout)

- Migrate all ~50 tool files to use toolResponse([events]) instead of
  createTextResponse/createErrorResponse/ad-hoc content construction

- Migrate infrastructure (typed-tool-factory, tool-invoker, simulator-utils,
  axe-helpers, ui-automation-guard, validation, xcode-tools-bridge) to route
  all errors through the pipeline

- Delete createErrorResponse, createTextResponse, consolidateContentForClaudeCode
  as there is now a single call site for response construction (toolResponse)

- Default manifest nextSteps to when:success so next steps never appear
  on error paths

- Fix daemon-routed tools: only set pipelineStreamMode:'complete' when a
  CLI renderer actually wrote to stdout

- Regenerate all 74 snapshot fixtures with consistent pipeline formatting
Add vitest.snapshot.config.ts for snapshot integration tests. Exclude
src/snapshot-tests/ from the default vitest config so `npm test` runs
only unit tests. Update package.json scripts so each test suite has
its own config with no overlap:

- npm test: unit tests only (vitest.config.ts)
- npm run test:snapshot: snapshot integration tests (vitest.snapshot.config.ts)
- npm run test:smoke: smoke tests (vitest.smoke.config.ts)
10 rounds of code simplification after the pipeline migration:

- Extract shared allText() test helper, replacing 52 duplicate
  definitions (allText/textOf/joinText) across 48 test files
- Extract shared executeAxeCommand to ui-automation/shared/axe-command.ts,
  removing ~500 lines of identical code across 10 UI automation tools
- Deduplicate header() construction across 22 tool files
- Deduplicate build stage label maps in event-formatting.ts
- Reuse formatHumanCompilerWarningEvent in formatGroupedWarnings
- Eliminate 2 barrel re-export files (responses/index.ts, validation/index.ts)
- Remove dead exports (summary builder, XcodebuildEvent alias)
- Remove dead doctor showAsciiLogo parameter
- Remove ~100+ redundant comments across test and tool files
- Normalize assertion patterns (toBeFalsy for success, toBe(true) for error)
- Fix formatting across all modified files

Net: -1700 lines, no behavioral changes.
- Fix 3-space indent for header params and icon-prefixed section lines
  to align with emoji grid
- Fix duplicate headers in xcodebuild error paths: build-utils no
  longer emits its own header when a pipeline is active
- Add deviceId to pipeline header params for device build-and-run
  and test tools
- Add -allowProvisioningUpdates to xcodebuild commands to prevent
  interactive keychain prompts corrupting CLI output
- Preserve test failure details and discovery in snapshot normalizer
  (removed over-aggressive TEST_FAILURE_BLOCK_REGEX and
  TEST_DISCOVERY_REGEX stripping)
- Fix section line splitting in debugging tools so multi-line output
  (stack, variables, lldb-command) is properly indented per line
- Add debugging happy-path snapshot tests (attach, stack, add-breakpoint,
  lldb-command, remove-breakpoint, detach) using live simulator debugger
- Add device error-path fixtures (stop, install, launch) and fix
  test--failure fixture naming
- Add @main to Calculator example app for Xcode 26.4 compatibility
- Regenerate all snapshot fixtures
Add withErrorHandling() higher-order wrapper and toErrorMessage() utility
to eliminate identical try/catch boilerplate repeated across nearly every
*Logic function in src/mcp/tools/.

The wrapper centralizes error normalization, optional logging, and
toolResponse assembly while preserving exact error messages via
per-tool options. Tools with class-specific error branching (UI
automation) or raw ToolResponse returns (get_*_app_path) use the
mapError escape hatch.

No behavior changes — all 1466 unit tests and 135 snapshot tests
pass with zero fixture updates.
…acy code

Convert 9 manual-handler tools to use createTypedToolWithContext, fixing a
validation bug in scaffold_ios_project and scaffold_macos_project where
.parse() threw unhandled ZodErrors instead of returning tool error responses.

- scaffold_ios_project, scaffold_macos_project: createTypedToolWithContext
- xcode_ide_call_tool, xcode_ide_list_tools: createTypedToolWithContext
- xcode_tools_bridge_{disconnect,status,sync}: createTypedToolWithContext
- session_show_defaults: createTypedToolWithContext
- swift_package_stop: createTypedToolWithContext

Remove legacy compatibility code:
- Remove default-export fallback from import-tool-module.ts
- Strip dead types from plugin-types.ts (PluginMeta, WorkflowGroup, etc.)
- Inline handler type in runtime/types.ts
- Delete unused workflow-selection.ts and its tests
Resources now use the same manifest-driven architecture as tools:
- YAML manifests in manifests/resources/ define metadata, URIs, and predicates
- Dynamic module import via import-resource-module.ts (named exports only)
- Predicate-aware registration filters resources using PredicateContext
- xcode-ide-state is now gated by runningUnderXcodeAgent predicate

Resource modules converted from default exports to named handler exports.
Resource registration in bootstrap.ts now receives PredicateContext.

Added resource visibility helpers to exposure.ts, resource manifest schema
and loader support, snapshot tests for 4 resources, and updated docs.
…setup

Production executor factories (getDefaultCommandExecutor, getDefaultFileSystemExecutor,
getDefaultInteractiveSpawner) no longer contain test-environment checks. Safety blocking
for unit tests is now handled by vitest-executor-safety.setup.ts which installs noop
overrides via beforeEach/afterEach, loaded only by the unit test vitest config.

- Remove all process.env.VITEST/NODE_ENV/SNAPSHOT_TEST_REAL_EXECUTOR checks from
  command.ts and interactive-process.ts
- Add interactive spawner override seam (__setTestInteractiveSpawnerOverride)
- Add createNoopInteractiveSpawner to mock-executors.ts
- Remove SNAPSHOT_TEST_REAL_EXECUTOR env toggles from snapshot harnesses
- Refactor swift_package_run.ts to remove test-env branching
- Update smoke harness to patch interactive spawner overrides
Document findings from investigating the structured event pipeline
migration status, renderer architecture rationale, and holdout tools.
…line

Re-migrate tools that were previously migrated to the pipeline (ac33b97f)
then reverted to manual text construction (c0693a1d). Extends the pipeline
infrastructure where needed to produce the fixture-defined output format
rather than bypassing the pipeline.

Infrastructure extensions:
- Add blankLineAfterTitle to SectionEvent for Errors/platform group spacing
- Add suppressCliStream to toolResponse() for late-bound CLI next steps
- Add extractQueryErrorMessages() structured parser in xcodebuild-error-utils
- Update formatSectionEvent() to honor blank line and empty line rendering

Tool migrations:
- get_sim_app_path: replace formatToolPreflight + manual text with events
- get_device_app_path: same migration pattern
- get_mac_app_path: same migration pattern
- list_devices: replace renderGroupedDevices() with section events per platform
- swift_package_run: remove manual error fallback content
- swift_package_test: remove manual error fallback content

All 1446 unit tests pass. All snapshot fixtures match unmodified.
…nt union

The renderer interface handles generic PipelineEvents used by all tools,
not just xcodebuild. Rename to PipelineRenderer to reflect actual usage.

Split PipelineEvent into CommonPipelineEvent (generic UI/output events)
and BuildTestPipelineEvent (xcodebuild/swift-toolchain-specific events)
to make the type boundary explicit. The full PipelineEvent union remains
as their combination for backward compatibility.
Normalize environment-specific data in resource-harness that caused
flaky doctor and device snapshot failures:
- argv and execPath (vary by node/vitest install location)
- nvm version in PATH entries
- node_modules/.bin PATH entries injected by npm at runtime
- Process tree line count (varies per run context)
- Device connection status checkmarks (depend on physical device state)
The build_run_sim, build_run_device, and build_run_macos orchestrators
inlined the same command construction and result parsing that existed
in their corresponding step tools (install, launch, boot). This created
duplication across 8 files that diverged independently.

Extract reusable internal primitives into three new modules:
- simulator-steps.ts: findSimulatorById, installAppOnSimulator, launchSimulatorApp
- device-steps.ts: installAppOnDevice, launchAppOnDevice
- macos-steps.ts: launchMacApp

Both orchestrators and step-tool handlers now delegate to these shared
primitives, following the existing handleTestLogic pattern used by the
test tools.
…alone logging

Remove the 5 standalone logging tools (start/stop device/sim log capture,
launch_app_logs_sim) and the logging workflow. Log capture is now automatic
in build_run and launch_app tools, with log file paths included in responses.

Adds derived-data-path and log-paths utilities to support the integrated
log capture approach. Tool count reduced from 76 to 71 across 13 workflows.
Remove the __flowdeck_fixtures__, __fixtures_designed__ directories,
all flowdeck test files, and the flowdeck harness. These alternative
fixture formats are no longer needed.
Enhance displayPath to use ~/ shorthand for paths under the user's home
directory when cwd-relative display isn't possible. Apply displayPath to
log paths in build-run results, launch-app details, and swift-package run
output. Update snapshot normalizer to handle ~/ for fixture compatibility.
Existing changes from prior work (home directory path collapsing,
simulator steps, snapshot normalization). These are the starting point.

Baseline: 5 pre-existing test failures, 1399 passing.
New files:
- src/rendering/types.ts: RenderSession, ToolHandlerContext, ImageAttachment types
- src/rendering/render.ts: createRenderSession(strategy), renderEvents()

Text strategy replicates mcp-renderer grouping/flush logic.
JSON strategy produces JSONL output.
Both track events, attachments, and error state.
Session and context infrastructure:
- RenderSession created by boundary, passed through invoker to handlers
- ToolHandlerContext available via AsyncLocalStorage (getHandlerContext)
- Factory wrappers detect void returns and build ToolResponse from session
- Backward compatible: existing handlers returning ToolResponse still work

Unit tests: 2 pre-existing failures, 1403 passing
Snapshot tests: 1 env-specific failure (doctor PATH), 131 passing
65 tool handlers migrated + 50 test files updated.
Unit tests: 2 pre-existing failures, 1403 passing.
Snapshot tests: 60 failures from double-rendering (CLI boundary
not yet updated — needs steps 6-7).
toolResponse() now skips CLI renderers when running inside the factory's
AsyncLocalStorage context. This prevents double-rendering for migrated
tools.

Remaining: 7 snapshot failures from CLI text renderer formatting
differences. The render session uses MCP renderer formatting, but
CLI subprocess tests expect CLI text renderer formatting.
This requires CLI boundary (step 6) to be implemented.
…s 3+6)

Migrated 65 non-pipeline tool handlers to ctx.emit pattern via
getHandlerContext(). All tools use AsyncLocalStorage to access the
handler context without positional parameter conflicts.

CLI boundary: printToolResponse re-renders session events through the
CLI text renderer for proper spacing. Next-steps appended as events
in _meta for unified rendering.

Double-rendering prevented by suppressing CLI renderer in toolResponse()
when inside handler context.

Unit tests: 2 pre-existing failures, 1403 passing
Snapshot tests: 1 env-specific failure (doctor PATH), 131 passing
- Pipeline accepts emit callback from handler context
- Inline finalization replaces pending pattern
- Parser debug file ref defaults to false
- Error fallback policy defaults to if-no-structured-diagnostics

Unit tests: 2 pre-existing failures, 1402 passing
Snapshot tests: 10 failures to investigate
Pipeline tools migrated from pending pattern to inline finalization
with ctx.emit. Updated swift-package build error fixture to reflect
improved structured error output (errors section + summary + log path).

Unit tests: 2 pre-existing failures, 1402 passing
Snapshot tests: 1 env-specific failure (doctor PATH), 131 passing
Added normalizations for volatile doctor output values:
- Node.js version, OS release, CPU, memory
- Xcode version and paths, dependency versions
- TMPDIR, homedir, full PATH section
- Tool counts and server version

Updated debugging stack trace fixture for current runtime frames.
Updated swift-package build error fixture for structured error output.

All snapshot tests pass. Unit tests: 2 pre-existing failures, 1402 passing.
- Run docs:update for TOOLS.md and TOOLS-CLI.md
- Update debugging stack trace fixture for current runtime frames
Technical debt from the refactor that can be addressed incrementally:
- Remove hybrid toolResponse() usage from withErrorHandling mapError paths
- Clean up ToolResponse return types in tool handler signatures
- Daemon protocol v2 (events over wire)
- Delete dead renderer code
- Encapsulate ToolResponse type
Removed the IIFE + toolResponse() + _meta.events extraction pattern
from ~40 tool handlers. Tools now emit events directly via ctx.emit()
without the intermediate ToolResponse construction.

Remaining toolResponse() callers are legitimate:
- mapError callbacks in withErrorHandling (returns ToolResponse by design)
- Dual-mode tools (list_sims, list_devices) for non-ctx fallback
- Pipeline tool error paths
- External-facing functions (doctor, xcode-ide bridge)

Unit tests: 2 pre-existing failures, 1402 passing
Snapshot tests: all 13 files pass
- Added emit function to mapError callback for direct event emission
- Converted all mapError callbacks from toolResponse() to emit()
- Converted pipeline pre-flight errors to ctx.emit()
- Converted record_sim_video to ctx.emit()
- Updated record_sim_video tests for new pattern

toolResponse() removed from 48 of 52 tool handler files. Remaining 4
are legitimate external API boundaries:
- doctor.ts (external callers: doctor-cli, resources)
- xcode-ide/shared.ts (bridge callers)
- list_sims.ts, list_devices.ts (resource callers without handler context)

Unit tests: 2 pre-existing failures, 1402 passing
Snapshot tests: all 13 files pass
Zero toolResponse() callers remain in src/mcp/tools/.

- doctor.ts: runDoctor returns events via eventsToToolResponse()
- xcode-ide/shared.ts: withBridgeToolHandler emits via ctx directly
- list_sims.ts, list_devices.ts: removed dual-mode fallback, always
  use ctx.emit (resources already set up AsyncLocalStorage)
- New utility: src/utils/events-to-tool-response.ts for callers that
  need ToolResponse without the old renderer pipeline

Unit tests: 2 pre-existing failures, 1402 passing
Snapshot tests: all 13 files pass
… 10)

Deleted dead code:
- src/utils/tool-response.ts (zero callers)
- src/utils/renderers/mcp-renderer.ts (zero callers)
- src/utils/renderers/cli-jsonl-renderer.ts (zero callers)
- src/utils/__tests__/tool-response.test.ts
- src/utils/renderers/__tests__/mcp-renderer.test.ts

Cleaned up:
- renderers/index.ts: removed resolveRenderers() and dead imports
- xcodebuild-pipeline.ts: removed renderer fallback (emit is required)
- xcodebuild-output.ts: removed mcpContent from PipelineResult
- Pipeline tests updated to provide emit callback

toolResponse() is gone. Events flow through ctx.emit only.

Unit tests: 2 pre-existing failures, 135 files pass
Snapshot tests: all 13 files pass, 133 tests pass
6 build/build-run tool handlers no longer reference ToolResponse.
Return types narrowed from Promise<ToolResponse | void> to Promise<void>.

7 files legitimately retain ToolResponse:
- doctor, xcode-ide/shared (external API)
- list_sims, list_devices (dual-mode for resources)
- test_sim/device/macos (handleTestLogic return type)
…inal)

- test_sim/device/macos: handleTestLogic returns void, emits via ctx
- list_sims/list_devices: removed dual-mode fallback, always emit via ctx
- doctor: runDoctor returns inferred type, uses renderEvents
- build tool return types narrowed to Promise<void>

Only src/mcp/tools/xcode-ide/shared.ts retains ToolResponse — it
receives (not constructs) a ToolResponse from the external Xcode bridge.

toolResponse() deleted. resolveRenderers() deleted.
mcp-renderer.ts deleted. cli-jsonl-renderer.ts deleted.

Unit tests: 2 pre-existing failures, 135 pass
Snapshot tests: all 13 files pass, 133 tests pass
Bridge now returns BridgeToolResult (events + images + metadata)
instead of ToolResponse. shared.ts emits bridge events directly
via ctx without unpacking a ToolResponse wrapper.

Zero ToolResponse references remain in src/mcp/tools/.
ToolResponse now exists only at transport boundaries:
- types/common.ts (definition)
- tool-registry.ts (MCP envelope construction)
- daemon protocol/server/client (wire format)
- runtime/tool-invoker.ts (postProcessToolResponse for MCP boundary)

Eliminated from:
- CLI output (printSessionOutput replaces printToolResponse)
- CLI register-tool-commands (reads from session directly)
- CLI tool-catalog (emits into session)
- typed-tool-factory (returns void, no ToolResponse construction)
- tool-error-handling (legacy non-ctx overload deleted)
- build-utils (returns BuildCommandResult, not ToolResponse)
- list_sims/list_devices (dual-mode removed, always use ctx)
- resources/simulators/devices (set up AsyncLocalStorage)

Deleted: src/utils/events-to-tool-response.ts (zero callers)

Unit tests: 2 pre-existing failures, 1394 passing
Snapshot tests: 2 env-specific failures (no physical device), 131 passing
RenderSession.emit() now returns void instead of TextRenderOp, removing
the dual text/event tracking pattern. CLI output modes (text, json) are
handled by dedicated render sessions (cli-text, cli-json) that write to
stdout directly, eliminating the printSessionOutput intermediary.

Daemon protocol bumped to v2 with DaemonToolResult replacing ToolResponse
on the wire. Added DaemonVersionMismatchError and auto-restart logic so
CLI transparently recovers when connecting to a stale daemon.

Removed postProcessToolResponse in favor of postProcessSession which
operates on sessions directly across MCP, daemon, and snapshot harness
boundaries.
The tool factory functions already returned a ToolTestResult-shaped object
at runtime when called without a ToolHandlerContext (test mode), but lied
to TypeScript via `as unknown as void`. This caused 350+ typecheck errors
across test files accessing .content and .isError on void.

Add ToolHandler overloaded interface that reflects the actual runtime
behavior: void with context (production), ToolTestResult without (tests).
Update callHandler return type and allText helper to accept broader
content types. Fix miscellaneous test type issues (findLast, readonly
properties, mock return types).
resolveAppPid scanned all output lines with a bracket regex that matched
3+ digit numbers like [404], causing HTTP status codes to be returned as
PIDs. Restrict to checking only the first non-empty line using the colon
format (e.g. com.example.app: 42567), which is the actual simctl output
pattern.
The tilde replacement only handled ~/path patterns. Add a rule for bare
~ (preceded by whitespace or colon, followed by whitespace or EOL) so
that outputs like 'Home: ~' are normalized to 'Home: <HOME>'.
@sentry
Copy link
Copy Markdown

sentry bot commented Apr 8, 2026

🚧 Skipped: PR exceeds review size limit.

Please split into smaller PRs and re-run.
Reference ID: 13026783

@cameroncooke cameroncooke changed the title ref: Unify rendering pipeline and eliminate ToolResponse from tool handlers ref: Unify rendering pipeline with structured output and snapshot tests Apr 8, 2026
@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new bot commented Apr 8, 2026

Open in StackBlitz

npm i https://pkg.pr.new/xcodebuildmcp@318

commit: 8c3b1d9

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant